76 research outputs found

    Attributed Multi-order Graph Convolutional Network for Heterogeneous Graphs

    Heterogeneous graph neural networks aim to discover discriminative node embeddings and relations from multi-relational networks. One challenge of heterogeneous graph learning is the design of learnable meta-paths, which significantly influences the quality of learned embeddings. Thus, in this paper, we propose an Attributed Multi-Order Graph Convolutional Network (AMOGCN), which automatically learns meta-paths containing multi-hop neighbors from an adaptive aggregation of multi-order adjacency matrices. The proposed model first builds different orders of adjacency matrices from manually designed node connections. After that, an intact multi-order adjacency matrix is obtained from the automatic fusion of various orders of adjacency matrices. This process is supervised by the node semantic information, which is extracted from the node homophily evaluated by attributes. Eventually, we utilize a one-layer simplifying graph convolutional network with the learned multi-order adjacency matrix, which is equivalent to the cross-hop node information propagation with multi-layer graph neural networks. Substantial experiments reveal that AMOGCN gains superior semi-supervised classification performance compared with state-of-the-art competitors.
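
The fusion of multi-order adjacency matrices followed by a single propagation step can be illustrated with a small sketch. This is not the AMOGCN implementation: the softmax weighting, the row normalization, and the function names `fuse_multi_order` and `propagate` are assumptions made for illustration, and the attribute-based supervision described in the abstract is omitted.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(a):
    """Scale every row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in a:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def softmax(logits):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_multi_order(adj, logits):
    """Weighted sum of A^1 .. A^K, one (hypothetical) logit per order.

    In a trained model the logits would be learned; here they are inputs.
    """
    weights = softmax(logits)
    n = len(adj)
    base = row_normalize(adj)
    power = base
    fused = [[0.0] * n for _ in range(n)]
    for w in weights:
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * power[i][j]
        power = matmul(power, base)  # move on to the next order
    return fused


def propagate(fused, features):
    """One propagation step: every node aggregates over the fused matrix."""
    return matmul(fused, features)
```

On a three-node path graph with self-loops, nodes 0 and 2 are not adjacent, yet the fused matrix built from orders 1 and 2 gives them a nonzero weight, so a single propagation step already mixes information across two hops.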

    Multi-view Graph Convolutional Networks with Differentiable Node Selection

    Multi-view data containing complementary and consensus information can facilitate representation learning by exploiting the intact integration of multi-view features. Because most objects in the real world often have underlying connections, organizing multi-view data as heterogeneous graphs is beneficial to extracting latent information among different objects. Due to the powerful capability to gather information of neighborhood nodes, in this paper, we apply Graph Convolutional Network (GCN) to cope with heterogeneous-graph data originating from multi-view data, which is still under-explored in the field of GCN. In order to improve the quality of network topology and alleviate the interference of noise yielded by graph fusion, some methods undertake sorting operations before the graph convolution procedure. These GCN-based methods generally sort and select the most confident neighborhood nodes for each vertex, such as picking the top-k nodes according to pre-defined confidence values. Nonetheless, this is problematic due to the non-differentiable sorting operators and inflexible graph embedding learning, which may result in blocked gradient computations and undesired performance. To cope with these issues, we propose a joint framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS), which consists of an adaptive graph fusion layer, a graph learning module and a differentiable node selection schema. MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network. The effectiveness of the proposed method is verified by rigorous comparisons with considerable state-of-the-art approaches in terms of multi-view semi-supervised classification tasks.

    Predicting multiple functions of sustainable flood retention basins under uncertainty via multi-instance multi-label learning

    The ambiguity of diverse functions of sustainable flood retention basins (SFRBs) may lead to conflict and risk in water resources planning and management. How can someone provide an intuitive yet efficient strategy to uncover and distinguish the multiple potential functions of SFRBs under uncertainty? In this study, by exploiting both input and output uncertainties of SFRBs, the authors developed a new data-driven framework to automatically predict the multiple functions of SFRBs by using multi-instance multi-label (MIML) learning. A total of 372 sustainable flood retention basins, characterized by 40 variables associated with confidence levels, were surveyed in Scotland, UK. A Gaussian model with Monte Carlo sampling was used to capture the variability of variables (i.e., input uncertainty), and the MIML-support vector machine (SVM) algorithm was subsequently applied to predict the potential functions of SFRBs that have not yet been assessed, allowing for one basin belonging to different types (i.e., output uncertainty). Experiments demonstrated that the proposed approach enables effective automatic prediction of the potential functions of SFRBs (e.g., accuracy >93%). The findings suggest that the functional uncertainty of SFRBs under investigation can be better assessed in a more comprehensive and cost-effective way, and the proposed data-driven approach provides a promising method of doing so for water resources management.
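
The input-uncertainty step, in which each basin becomes a bag of sampled instances, might be sketched as follows. The abstract does not say how a confidence level maps to a Gaussian spread, so the linear rule `max_sd * (1 - confidence)`, the function name `sample_bag`, and all parameter names are hypothetical. The MIML-SVM classifier that would consume the bags is not shown.

```python
import random


def sample_bag(values, confidences, n_instances=20, max_sd=1.0, seed=0):
    """Draw a bag of noisy copies of one basin's variable vector.

    values      : recorded value of each variable
    confidences : confidence in [0, 1] for each variable
    Assumption: the standard deviation shrinks linearly as confidence
    rises, so a fully trusted variable (confidence 1.0) is never perturbed.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    bag = []
    for _ in range(n_instances):
        instance = [rng.gauss(v, max_sd * (1.0 - c))
                    for v, c in zip(values, confidences)]
        bag.append(instance)
    return bag
```

Each basin then contributes one bag (multi-instance) that can carry several function labels at once (multi-label), which is the input shape an MIML learner expects.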

    Robust automated detection of microstructural white matter degeneration in Alzheimer’s disease using machine learning classification of multicenter DTI data

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer’s disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. 
    Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was not part of the training sample.
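
The scanner-specific cross-validation described in this abstract holds out all data from one scanner at a time. A minimal sketch of that split is given below, together with one possible reading of "mean correction" (subtracting each scanner's per-feature mean). The abstract does not spell out the correction, so `mean_correct` is an assumption, and both function names are illustrative.

```python
from collections import defaultdict


def leave_one_scanner_out(scanner_ids):
    """Yield (held_out_scanner, train_indices, test_indices) per scanner."""
    for held_out in sorted(set(scanner_ids)):
        train = [i for i, s in enumerate(scanner_ids) if s != held_out]
        test = [i for i, s in enumerate(scanner_ids) if s == held_out]
        yield held_out, train, test


def mean_correct(features, scanner_ids):
    """Subtract each scanner's per-feature mean from its own samples.

    Assumption: this is only one plausible form of mean correction.
    """
    groups = defaultdict(list)
    for row, scanner in zip(features, scanner_ids):
        groups[scanner].append(row)
    means = {scanner: [sum(col) / len(rows) for col in zip(*rows)]
             for scanner, rows in groups.items()}
    return [[v - m for v, m in zip(row, means[scanner])]
            for row, scanner in zip(features, scanner_ids)]
```

Pooled cross-validation, by contrast, would draw test samples at random from all scanners, so the classifier has already seen data from every scanner during training.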

    New Techniques for Clustering Complex Objects

    The tremendous amount of data produced nowadays in various application domains such as molecular biology or geography can only be fully exploited by efficient and effective data mining tools. One of the primary data mining tasks is clustering, which is the task of partitioning points of a data set into distinct groups (clusters) such that two points from one cluster are similar to each other whereas two points from distinct clusters are not. Due to modern database technology, e.g., object-relational databases, a huge number of complex objects from scientific, engineering or multimedia applications are stored in database systems. Modelling such complex data often results in very high-dimensional vector data ("feature vectors"). In the context of clustering, this causes a lot of fundamental problems, commonly subsumed under the term "Curse of Dimensionality". As a result, traditional clustering algorithms often fail to generate meaningful results, because in such high-dimensional feature spaces data does not cluster anymore. But usually, there are clusters embedded in lower dimensional subspaces, i.e., meaningful clusters can be found if only a certain subset of features is regarded for clustering. The subset of features may even be different for varying clusters. In this thesis, we present original extensions and enhancements of the density-based clustering notion to cope with high-dimensional data. In particular, we propose an algorithm called SUBCLU (density-connected Subspace Clustering) that extends DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to the problem of subspace clustering. SUBCLU efficiently computes all clusters of arbitrary shape and size that would have been found if DBSCAN were applied to all possible subspaces of the feature space. Two subspace selection techniques called RIS (Ranking Interesting Subspaces) and SURFING (SUbspaces Relevant For clusterING) are proposed.
    They do not compute the subspace clusters directly, but generate a list of subspaces ranked by their clustering characteristics. A hierarchical clustering algorithm can be applied to these interesting subspaces in order to compute a hierarchical (subspace) clustering. In addition, we propose the algorithm 4C (Computing Correlation Connected Clusters) that extends the concepts of DBSCAN to compute density-based correlation clusters. 4C searches for groups of objects which exhibit an arbitrary but uniform correlation. Often, the traditional approach of modelling data as high-dimensional feature vectors is no longer able to capture the intuitive notion of similarity between complex objects. Thus, objects like chemical compounds, CAD drawings, XML data or color images are often modelled by using more complex representations like graphs or trees. If a metric distance function like the edit distance for graphs and trees is used as similarity measure, traditional clustering approaches like density-based clustering are applicable to those data. However, we face the problem that a single distance calculation can be very expensive. As clustering performs a lot of distance calculations, approaches like filter and refinement and metric indices become important. The second part of this thesis deals with special approaches for clustering in application domains with complex similarity models. We show how appropriate filters can be used to enhance the performance of query processing and, thus, clustering of hierarchical objects. Furthermore, we describe how the two paradigms of filtering and metric indexing can be combined. As complex objects can often be represented by using different similarity models, a new clustering approach is presented that is able to cluster objects that provide several different complex representations.
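
The motivation for subspace clustering can be made concrete with a plain DBSCAN restricted to a chosen subset of dimensions. The sketch below is a textbook DBSCAN with a linear-scan neighborhood query, not SUBCLU, RIS, SURFING, or 4C; the `dims` parameter and the function names are illustrative assumptions.

```python
import math

NOISE = -1


def region_query(points, i, eps, dims):
    """Indices of all points within eps of point i, measured only on dims."""
    return [j for j, q in enumerate(points)
            if math.sqrt(sum((points[i][d] - q[d]) ** 2 for d in dims)) <= eps]


def dbscan(points, eps, min_pts, dims=None):
    """Return one cluster label per point; NOISE marks unclustered points."""
    if dims is None:
        dims = list(range(len(points[0])))  # default: full feature space
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps, dims)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps, dims)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels
```

With points that are tightly grouped in the first dimension but scattered in the second, running `dbscan` on the full space reports only noise, while `dims=[0]` recovers the groups. Running DBSCAN on every subspace this way is exponential in the number of dimensions, which is the cost the abstract says SUBCLU is designed to avoid.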

    Two-Year Progress of Pilot Research Activities in Teaching Digital Thinking Project (TDT)

    This article presents a progress report from the last two years of the Teaching Digital Thinking (TDT) project. This project aims to implement new concepts, didactic methods, and teaching formats for sustainable digital transformation in Austrian Universities’ curricula by introducing new digital competencies. By equipping students and teachers with 21st-century digital competencies, partner universities can contribute to solving global challenges and organizing pilot projects. In line with the overall project aims, this article presents the ongoing digital transformation activities, courses, and research in the project, which have been carried out by the five partner universities since 2020, and briefly discusses the results. The research and educational activities are summarized in two parts: complementary research and pilot projects.

    Rheumatoid arthritis - clinical aspects: 134. Predictors of Joint Damage in South Africans with Rheumatoid Arthritis

    Background: Rheumatoid arthritis (RA) causes progressive joint damage and functional disability. Studies on factors affecting joint damage as clinical outcome are lacking in Africa. The aim of the present study was to identify predictors of joint damage in adult South Africans with established RA. Methods: In a cross-sectional study, 100 black patients with RA of >5 years' duration were assessed for joint damage using a validated clinical method, the RA articular damage (RAAD) score. Potential predictors of joint damage that were documented included socio-demographics, smoking, body mass index (BMI), disease duration, delay in disease modifying antirheumatic drug (DMARD) initiation, global disease activity as measured by the disease activity score (DAS28), erythrocyte sedimentation rate (ESR), C reactive protein (CRP), and autoantibody status. The predictive value of variables was assessed by univariate and stepwise multivariate regression analyses. A p value <0.05 was considered significant. Results: The mean (SD) age was 56 (9.8) years, disease duration 17.5 (8.5) years, educational level 7.5 (3.5) years and DMARD lag was 9 (8.8) years. Female to male ratio was 10:1. The mean (SD) DAS28 was 4.9 (1.5) and total RAAD score was 28.3 (12.8). The mean (SD) BMI was 27.2 (6.2) kg/m² and 93% of patients were rheumatoid factor (RF) positive. More than 90% of patients received between 2 and 3 DMARDs. Significant univariate predictors of a poor RAAD score were increasing age (p = 0.001), lower education level (p = 0.019), longer disease duration (p < 0.001), longer DMARD lag (p = 0.014), lower BMI (p = 0.025), high RF titre (p < 0.001) and high ESR (p = 0.008). The multivariate regression analysis showed that the only independent significant predictors of a higher mean RAAD score were older age at disease onset (p = 0.04), disease duration (p < 0.001) and RF titre (p < 0.001). There was also a negative association between BMI and the mean total RAAD score (p = 0.049).
    Conclusions: Patients with longstanding established RA have more severe irreversible joint damage as measured by the clinical RAAD score, contrary to other studies in Africa. This is largely reflected by a delay in the initiation of early effective treatment. Independent of disease duration, older age at disease onset and a higher RF titre are strongly associated with more joint damage. The inverse association between BMI and articular damage in RA has been observed in several studies using radiographic damage scores. The mechanisms underlying this paradoxical association are still widely unknown but adipokines have recently been suggested to play a role. Disclosure statement: C.I. has received a research grant from the Connective Tissue Diseases Research Fund, University of the Witwatersrand. All other authors have declared no conflicts of interest.